Arabase - A Database Combining Different Arabic Resources with Lexical and Semantic Information
Abstract
Language resources are an important factor in any NLP application. However, language resource support for Arabic is poor because the existing Arabic language resources are scattered, inconsistent, or even incomplete. In this paper we discuss the notion of an integrated Arabic resource that leverages various pre-existing ones. We compare these resources and then present preliminary fully automated and semi-automated methods to integrate them. This work serves as a bootstrap for a rich Arabic resource with good potential to interface with WordNet.
Similar Resources
Preliminary Lexical Framework For English-Arabic Semantic Resource Construction
This paper describes preliminary work concerning the creation of a Framework to aid in lexical semantic resource construction. The Framework consists of 9 stages during which various lexical resources are collected, studied, and combined into a single combinatory lexical resource. To evaluate the general Framework it was applied to a small set of English and Arabic resources, automatically comb...
The Cornetto database: architecture and alignment issues of combining lexical units, synsets and an ontology
Cornetto is a two-year Stevin project (project number STE05039) in which a lexical semantic database is built that combines Wordnet with Framenet-like information for Dutch. The combination of the two lexical resources (the Dutch wordnet and the Referentie Bestand Nederlands) will result in a much richer relational database that may improve natural language processing (NLP) technologies, such a...
ARABASE: A Relational Database for Arabic OCR Systems
In this paper we present a database for research on Arabic off-line and on-line handwriting optical recognition as well as machine-printed text optical recognition. Digital images of documents, text phrases, words/sub-words, isolated characters, digits, signatures, and so on are included in ARABASE. Data corresponds to a variety of lexes (city names, literal amounts, isolated character...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity
Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...